A hopefully comprehensive list of currently 283 tools used in corpus compilation and analysis.
This list is kept up to date by its users. Hence, please feel free to contribute by suggesting new tools.
You can also make suggestions, e.g., corrections, regarding individual tools by clicking the ✎ symbol. As this is a non-commercial side (side, side) project, checking and incorporating updates usually takes some time.
There is also a comprehensive list of all tags in the database.
Tool | Description | Tags | Platforms | Pricing |
---|---|---|---|---|
AntCorGen ✎ | A freeware discipline-specific corpus creation tool. | compilation, text analysis | Windows, Mac, Linux | Free |
AntGram ✎ | A freeware n-gram and p-frame (open-slot n-gram) generation tool. | text analysis, n-grams, p-frames, lexical bundles, lexical frames | Windows, Mac, Linux | Free |
AntMover ✎ | Tool for text structure (moves) analysis | text analysis | Windows | Free |
Bow ✎ | Statistical Language Modeling, Text Retrieval, Classification and Clustering | text analysis | UNIX, Linux | Free |
Chared ✎ | Tool for detecting the character encoding of a text | text analysis | Python 2.6 or later | Free |
CorpusExplorer ✎ | A complex corpus analysis toolkit combining 45 interactive tools. | visualization, exploration, tagging, text analysis | Windows | Free, Open Source |
Corpustools ✎ | An R package for managing, querying, and analyzing texts. | text analysis, R | R | Free, Open Source |
DocuScope ✎ | A tool for computer-aided rhetorical analysis | rhetorical analysis, text analysis, visualization | Windows (Java) | Free |
Online Graded Text Editor ✎ | Tool for profiling a text's vocabulary level and complexity | text analysis, editing, vocabulary | OSX, Windows | Free |
pysupersensetagger ✎ | Analyses texts for MWE and supersenses. | text analysis | Unix, Mac (Python) | Free |
QDA Miner ✎ | A commercial QDA tool for coding, annotating, retrieving and analyzing collections of documents and images. | qda, mixed methods, text analysis | Windows | Commercial |
QualCoder ✎ | QualCoder is free, open source software for qualitative data analysis. | qda, text analysis | Linux, Mac, Windows | Free, Open Source |
Sketch Engine ✎ | A corpus manager and text analysis software developed by Lexical Computing. | annotation, concordancer, tagging, sampling, search, visualization, wordlists, keywords, compilation, text analysis, n-grams, collocation, statistics, segmentation, analysis, crawler, parallel, colligation, annotations, tokenization, query, ngrams, boilerplate remover, comparison, frequency analysis, information retrieval, data, sentence boundary, corpus creation, duplicate remover, regex, thesaurus, meta modelling, dictionary, text-processing, xml, frequency, trends patterns, web-based, collocates, collocation analysis, word cloud, coocurence, KWIC, corpus management, multilingual, NLP, diachronic analysis, term extraction, keyword extraction, bilingual term extraction | | 30-day free trial, then from 4.83 €/month |
Stylo for R ✎ | Tool for computational stylistic analysis (authorship attribution, genre analysis) | text analysis | | Free |
The Text Feature Analyser ✎ | A tool for investigating textual features and various measures | text analysis, concordancer | Windows | Free |
TXM ✎ | XML & TEI compatible text analysis software based on TreeTagger, the CQP search engine and the R statistical environment. | text analysis, concordancer, r, statistics, search tool, tokenizer, xml | Windows, Mac, Linux, Tomcat | Free |
Voyant Tools ✎ | A web-based reading/analysis toolkit for digital texts. | reading, text analysis, visualization, trends patterns | Web | Free, Open Source |
FLAX ✎ | FLAX (Flexible Language Acquisition) is a set of tools and applications to automate the production and delivery of interactive digital language collections. | language learning, language teaching, text analysis | Java, Moodle | Free, Open Source |
Orange Data Mining ✎ | An open source machine learning and data visualization platform based on workflows. | text analysis, visualization, time series | Windows, Unix, Linux, Mac | Free, Open Source |
Wordless ✎ | An integrated corpus tool with multilingual support for the study of language, literature, and translation. | concordancer, text analysis, statistics, readability | Windows, Mac, Linux, Python | Free, Open Source |
Last Updated: October 13, 2024.
In case you are interested, the data is also available in JSON format.
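For readers who want to work with the JSON data, here is a minimal Python sketch of filtering tools by tag and platform. The schema is an assumption: the field names `tool`, `tags`, `platforms`, and `pricing` simply mirror the table columns above, and the actual export may be structured differently. The inline sample stands in for the downloaded file so the snippet runs on its own.

```python
import json

# Hypothetical sample mirroring the table columns; the real export's
# field names and layout may differ.
SAMPLE = """
[
  {"tool": "AntMover", "tags": ["text analysis"],
   "platforms": ["Windows"], "pricing": "Free"},
  {"tool": "Voyant Tools",
   "tags": ["reading", "text analysis", "visualization", "trends patterns"],
   "platforms": ["Web"], "pricing": "Free, Open Source"},
  {"tool": "QDA Miner", "tags": ["qda", "mixed methods", "text analysis"],
   "platforms": ["Windows"], "pricing": "Commercial"}
]
"""


def filter_tools(records, tag=None, platform=None):
    """Return the names of tools matching the given tag and/or platform.

    Matching is case-insensitive; a criterion left as None is ignored.
    """
    wanted_tag = tag.lower() if tag else None
    wanted_platform = platform.lower() if platform else None
    matches = []
    for record in records:
        tags = {t.lower() for t in record.get("tags", [])}
        platforms = {p.lower() for p in record.get("platforms", [])}
        if wanted_tag and wanted_tag not in tags:
            continue
        if wanted_platform and wanted_platform not in platforms:
            continue
        matches.append(record["tool"])
    return matches


tools = json.loads(SAMPLE)
print(filter_tools(tools, tag="text analysis", platform="Windows"))
```

To use this with the real data, replace `json.loads(SAMPLE)` with a `json.load` call on the downloaded file and adjust the field names to whatever the export actually uses.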